New fabrication technologies bring more pronounced aging effects, more frequent early-life failures, and incomplete testing and verification caused by time-to-market pressure, all of which pose reliability challenges for forthcoming systems. A promising solution to these challenges is self-test and self-reconfiguration with little or no external control. This work proposes a scalable self-test mechanism for periodic online testing of many-core processors. The mechanism enables autonomous detection and exclusion of faulty cores, making graceful degradation of the many-core architecture possible. Several test components are incorporated into the many-core architecture to distribute test stimuli, suspend normal operation of individual processing cores, apply the test, and detect faulty cores. Testing is performed concurrently with normal system operation, without any noticeable downtime at the application level. Experimental results show that the proposed test architecture scales well in terms of both hardware overhead and performance overhead, which makes it applicable to many-core processors with more than a thousand processing cores.